
[BugFix] Directly Convert Modifiers to Recipe Instance#1271

Merged
dsikka merged 5 commits into main from bugfix-modifier-parsing on Apr 2, 2025

Conversation

@rahul-tuli
Collaborator

Problem

Currently, the process of recipe creation follows this sequence:

Modifiers → String (Serialization) → Recipe Instance (Deserialization)

This intermediate serialization and deserialization step introduces issues when dealing with more complex objects, such as SmoothQuant mappings, which can lead to parsing errors.
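The failure mode is not specific to llm-compressor's serializer. As a generic illustration (using JSON here rather than the library's actual format), round-tripping live Python objects through a text representation can silently change their types, so structured values like a SmoothQuant mapping tuple no longer match what the parser expects:

```python
import json

# Hypothetical stand-in for a SmoothQuant mapping: a (targets, anchor) pair.
mapping = (["re:.*query_key_value"], "re:.*input_layernorm")

# Serialize to text and parse it back, as the old flow effectively did.
restored = json.loads(json.dumps(mapping))

print(type(mapping).__name__)   # tuple
print(type(restored).__name__)  # list
```

The round trip preserves the data but not the structure: the tuple comes back as a list, and code that validates or pattern-matches on the original type can fail.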

Solution

This PR refactors the flow to directly construct the Recipe Instance from Modifiers, thereby removing an unnecessary conversion step and eliminating a potential source of error.
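In outline, the change replaces the string round trip with direct construction from the modifier objects themselves. The sketch below uses simplified stand-in classes to show the shape of the idea; the real Recipe and Modifier types in llm-compressor carry much more state, and the names here are illustrative:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Modifier:
    """Minimal stand-in for an llm-compressor modifier."""
    name: str


@dataclass
class Recipe:
    """Minimal stand-in for a recipe built directly from live modifiers."""
    modifiers: List[Modifier]

    @classmethod
    def from_modifiers(cls, modifiers: List[Modifier]) -> "Recipe":
        # Build the recipe from the objects directly; no serialize/parse step,
        # so complex fields (e.g. SmoothQuant mappings) survive unchanged.
        return cls(modifiers=list(modifiers))


recipe = Recipe.from_modifiers([Modifier("SmoothQuant"), Modifier("GPTQ")])
print(len(recipe.modifiers))  # 2
```

Because the modifiers are never flattened to text, whatever types they hold are carried into the recipe as-is.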

Issue Tracking

This issue was originally surfaced in vllm-project/llm-compressor#37 and is formally tracked under [INFERENG-358](https://issues.redhat.com/browse/INFERENG-358).

Testing

The issue was reproduced using the following script, which previously errored out but now runs successfully with this fix:

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

DATASET_ID = "HuggingFaceH4/ultrachat_200k"
MODEL_ID = "bigscience/bloom-3b"
DATASET_SPLIT = "train_sft"
NUM_CALIBRATION_SAMPLES = 512
MAX_SEQUENCE_LENGTH = 2048

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Define quantization recipe
recipe = [
    SmoothQuantModifier(
        smoothing_strength=0.8,
        mappings=[
            (["re:.*query_key_value"], "re:.*input_layernorm"),
            (["re:.*dense_h_to_4h"], "re:.*post_attention_layernorm"),
        ],
    ),
    GPTQModifier(
        scheme="W8A8",
        targets="Linear",
        ignore=["lm_head"],
        dampening_frac=0.003,
    ),
]

# Load and preprocess dataset
dataset = load_dataset(DATASET_ID, split=DATASET_SPLIT)
dataset = dataset.shuffle(seed=42).select(range(NUM_CALIBRATION_SAMPLES))

def preprocess(example):
    """Formats the messages into a simple dialogue format."""
    text = "\n".join([msg["content"] for msg in example["messages"]])
    return {"text": text}

dataset = dataset.map(preprocess)

# Apply quantization
oneshot(
    model=model,
    dataset=dataset,
    recipe=recipe,
    output_dir="bloom-3b-gptq-w8a8",
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
)

With this fix, the script runs to completion without errors. Automated tests covering the new behavior have also been added.

… Recipe

Add e2e tests for recipe parsing

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>
@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

@rahul-tuli added the ready label on Mar 20, 2025
Collaborator

@brian-dellabetta left a comment

looks good! couple suggestions based on the parts i understand

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>
Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>
@rahul-tuli self-assigned this on Mar 25, 2025
Collaborator

@brian-dellabetta left a comment

cool, thanks!

@rahul-tuli enabled auto-merge (squash) March 26, 2025 14:29
Collaborator

@kylesayrs left a comment

Nice

@dsikka disabled auto-merge April 2, 2025 16:48
@dsikka enabled auto-merge (squash) April 2, 2025 17:23
@dsikka merged commit 027caa4 into main Apr 2, 2025
8 checks passed
@dsikka deleted the bugfix-modifier-parsing branch April 2, 2025 17:42